System and method for accessing different types of back end data stores

ABSTRACT

The present invention discloses a framework that allows a synchronization engine to synchronize data between a mobile device and Back End data stores independently from the architecture and data formats of those Back End data stores. The framework introduces content adapters, which access synchronization data from Back End data stores. These adapters convert the data into a Back End data store independent representation, which can be used by all applications or modules that need to access different back ends in a generic manner. A generic synchronization engine for the purpose of conflict detection and resolution is one example of a module of this kind. Other applications that could use the content adapters are Notification Frameworks or Portals and all other applications aggregating data.

The present invention relates to a method and system for exchange or synchronization of data between different clients, and in particular to synchronization of data between clients by using a central synchronization server linked with different types of Back End data stores.

Synchronization can be defined as keeping data consistent between different clients, e.g. a Notebook calendar and a Personal Digital Assistant (PDA) calendar, where the clients store the data in different data formats with different identifiers. The synchronization data can also consist of complete computer programs or parts of these programs.

Frequently, a central synchronization server is used through which data can be exchanged between different clients (see FIG. 1). The central synchronization server can either store all data locally (i.e. when using a proprietary data format), which is normally done by carriers (e.g. Yahoo) with high loads, or can directly access Back End data stores like DB2, Domino, or Exchange (see FIG. 2).

Back End data stores have their own protocols and mechanisms to access and store information. Although relational databases use ODBC as a common interface, databases storing Personal Information Management (PIM) data, for instance, are usually accessed in a proprietary way and rely on very specific data structures of the content. This results in a dependency of Web-Server based applications on the specific databases and the particular types of the accessed content. Adopting new databases and new types of content for existing software therefore requires considerable effort.

For instance, developing a large-scale synchronization solution, which involves many different database manufacturers and multiple kinds of multimedia content, is quite difficult under these circumstances: the synchronization engine, which includes the logic for synchronizing multiple client devices, needs to be adapted for each supported type of content and each connected database. In fact, today's synchronization engines depend strongly on the backend store that contains the data. Significant investments in implementing a sophisticated sync engine benefit only one or a few backend systems (see FIG. 3).

The same problem appears if a Notification System wants to inform a user by sending data which is collected and stored in different backend systems. Portals which aggregate data from different systems also need a system independent method to access the information.

U.S. Pat. No. 5,974,238 describes an apparatus for performing dynamic synchronization between data stored in a handheld computer and a host computer, each having a plurality of data sets including at least one common data set, each computer having a copy of the common data set. The handheld computer has a processor, a communication port, and a data synchronization engine. The data synchronization engine has a pseudo-cache and one or more tags connected to the pseudo-cache. Data is synchronized whenever data is written to main memory and/or when the associated pseudo-cache tag is invalidated. By strict adherence to a set of protocols, data coherency is achieved because the system always knows who owns the data, who has a copy of the data, and who has modified the data. The data synchronization engine resolves any differences in the copies and allows the storage of identical copies of the common data set in the host computer and in the handheld computer.

This prior art patent is not directed to a synchronization architecture using a Back End data store. Therefore, the above mentioned problems related to the Back End data store are neither addressed nor solved by that patent.

It is therefore an object of the present invention to provide a new method and system for exchange or synchronization of data in an architecture using a central synchronization server linked to different Back End data store types while avoiding the disadvantages of the prior art solutions.

That object is solved by the features of the independent claims. Further preferred embodiments are laid down in the dependent claims.

The present invention discloses a framework that allows a synchronization engine to synchronize data between a mobile device and Back End data stores independently from the architecture and data formats of that back-end store. The framework introduces content adapters, which access synchronization data from Back End data stores. These adapters convert the data into a Back End data store independent representation, which can be used by all applications or modules which need to access different back-ends in a generic manner. A generic synchronization engine for the purpose of conflict detection and resolution is one example of a module of this kind. Other applications that could use the content adapter are Notification Frameworks or Portals and all other applications aggregating data. Any Back End data store specific issues are handled by the Back End dependent part of the content adapters, which can easily be provided by third parties and plugged into the framework.

The present invention will be described in more detail with reference to the accompanying drawings, in which:

FIG. 1 shows a simplified synchronization architecture on which the present invention may be based,

FIG. 2 shows the prior art synchronization architecture with direct access to the Back End data store,

FIG. 3 shows the prior art synchronization architecture with different Back End data stores,

FIG. 4 shows the SyncML communication protocol which may be preferably implemented by the present invention,

FIG. 5 shows the basic architecture of the inventive content adapter framework (CAF),

FIG. 6 shows a preferred implementation of the CAF,

FIG. 7 shows the CAF interfaces,

FIG. 8 shows a communication flow between client, sync engine, CAF and backend-system with session authentication,

FIG. 9 shows the inheritance model as used by the CAF, and

FIG. 10 shows the CAF specific process flow.

Synchronization between different clients using a central synchronization server is based on a synchronization protocol which typically consists of the following steps:

Pre-Synchronization: To prepare the actual synchronization, some actions must be taken beforehand. These actions fall into the following groups: authentication, authorization, and determination of client capabilities. Authentication ensures that the server is who it claims to be, and that the client is who it claims to be. Authorization checks whether the client is allowed to perform the requested action (e.g. delete, update, or only create new entries). Finally, the server determines the device capabilities (e.g. maximum buffer size) to optimize the data flow to the client.

Synchronization: This is the part where the synchronization data is exchanged. Between two synchronization partners, all local IDs of data entries are mapped to global IDs known to both partners. Every partner therefore has a mapping table to map local to global IDs. Then only the updated, new, or deleted entries are exchanged. If both partners update the same data entry, there will be a conflict. This update conflict can be resolved in different ways: try to merge the updates, duplicate the entries, let one entry win over the other, or simply do nothing and report the conflict so that the user can solve it.
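The mapping table mentioned in the synchronization step can be pictured with a minimal Java sketch. The class name `IdMappingTable`, its method names, and the sample IDs are illustrative assumptions; the description does not prescribe any particular data structure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical mapping table that translates local IDs to global IDs and back. */
class IdMappingTable {
    private final Map<String, String> localToGlobal = new HashMap<>();
    private final Map<String, String> globalToLocal = new HashMap<>();

    /** Registers a pair and drops any earlier mapping that used either ID. */
    void put(String localId, String globalId) {
        String oldGlobal = localToGlobal.put(localId, globalId);
        if (oldGlobal != null) {
            globalToLocal.remove(oldGlobal);
        }
        String oldLocal = globalToLocal.put(globalId, localId);
        if (oldLocal != null && !oldLocal.equals(localId)) {
            localToGlobal.remove(oldLocal);
        }
    }

    /** Looks up the global ID known to both partners for a local ID. */
    Optional<String> toGlobal(String localId) {
        return Optional.ofNullable(localToGlobal.get(localId));
    }

    /** Looks up the partner's local ID for a global ID. */
    Optional<String> toLocal(String globalId) {
        return Optional.ofNullable(globalToLocal.get(globalId));
    }

    /** Removes the mapping of a deleted entry, e.g. during post-synchronization cleanup. */
    void removeLocal(String localId) {
        String globalId = localToGlobal.remove(localId);
        if (globalId != null) {
            globalToLocal.remove(globalId);
        }
    }
}
```

With such a table, a partner can translate an incoming global ID into its own local ID before applying an update, and remove the pair once the entry has been deleted.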

Post-synchronization: At post-synchronization all the cleanup tasks are performed, like updating the mapping tables, reporting unresolved conflicts, and so on.

A widely used synchronization protocol is SyncML. SyncML provides an XML-based protocol for synchronization that is independent of the underlying transport protocol. Each synchronization message is an XML document. A typical SyncML system is shown in FIG. 4, where Application B (e.g. Personal Digital Assistant) is sending synchronization data via its Client Sync Engine to the SyncML Framework. The SyncML Framework translates the API calls (e.g. Update, Create) and the data into a valid SyncML document and sends it to the Server. On the Server side the SyncML Framework receives the document, parses it, and then sends the command and data to the Server Sync Engine, which then talks to Application A (e.g. Lotus Notes Calendar of a notebook).

FIG. 5 shows the basic architecture of the inventive content adapter framework (CAF) used in a communication architecture between mobile clients and different Back End data store types.

The different mobile Clients 2, 4, 6 access the Sync Engine 12 via a wireless or wired gateway 8 and through a Web Server 10, and the Sync Engine 12 talks via CAF 20 to the different Back End data store types 24, 26. The CAF 20 provides the infrastructure to access data of different Back End data stores 24, 26 through a single backend neutral interface (CAF-interface 22) and to easily add new Back End data stores. The CAF 20 consists of at least a single CAF-interface 22 and one or more content adapters 28, 30.

The CAF-interface 22 represents a single interface for the Sync Server 10 to access Back End data and therefore separates the content retrieval from the Sync Server. Through the CAF-interface 22 a Sync Engine 12 is able to access content independently from a specific Back End data store 24, 26. For the data exchange between CAF-interface 22 and Sync Server 10, Data Objects are preferably used as the data format.

Basically the content adapter provides all data store specific dependencies.

In a preferred embodiment of the present invention each content adapter 28, 30 includes an abstract Back End independent part and a Back End dependent part. The Back End dependent part contains all data store specific dependencies. It implements access to the synchronized Back End data and creates a data store independent representation of that data, which is provided to the Sync Engine 12 or application layer using the CAF interface 22. The CAF specific process flow is managed by the Back End independent part of the content adapter 28, 30. The Back End independent part provides the functionality common to all Back End data stores, e.g. queuing mechanism and communication handling.

In order to include semantic information, which can be used by the application (e.g. sync engine), a class hierarchy of common data objects is defined: special subclasses of data objects describe the typical properties of the supported types of data, e.g. address, calendar, multimedia information, relational databases, etc. The properties describing a data object for a particular kind of information can be taken from common standards, such as vCard (standard format for exchanging business card information) for address book information or vCal (standard format for exchanging calendar information) for calendar entries. Applying XML even allows customer specific databases to be represented independently of a particular database.

Finally, the framework 20 provides the infrastructure to easily integrate a caching mechanism between the Sync Engine 12 and the Back End data stores 24, 26 for high volume systems or slow back-end systems (see FIG. 6).

The content adapters provide fast read/write access, are adaptable to different backend systems (e.g. Domino, DB2, Exchange), support multiple SyncML messages, always have consistent data, and are adaptable to different content formats.

The method carried out by the basic architecture of CAF may be briefly summarized as follows: The client requests a sync session with the Sync Server. The server authenticates the client and accepts the sync session. The client sends an update to the server. The Sync Server creates data objects and fills in the update received from the client. The Sync Server then calls the CAF interface and hands over the data objects. CAF selects the appropriate Back End specific part of the content adapter.

CAF calls the Back End specific part and passes the data objects to it. The Back End specific part of the content adapter transforms the data objects into a Back End specific format and calls the Back End specific API (application programming interface).
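The last steps of this method (hand over data objects, select the Back End specific part, transform, call the Back End API) might be sketched in Java as follows. All class and method names are hypothetical, the selection of an adapter by URI scheme is an assumption made only for illustration, and the example adapter records calls in memory instead of invoking a real Back End API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Back End neutral data object (hypothetical minimal shape). */
class DataObject {
    final String databaseUri;
    final String payload;

    DataObject(String databaseUri, String payload) {
        this.databaseUri = databaseUri;
        this.payload = payload;
    }
}

/** Abstract, Back End independent part: fixes the common process flow. */
abstract class ContentAdapter {
    /** Transforms the data object and then calls the Back End specific API. */
    final void update(DataObject obj) {
        callBackEndApi(toBackEndFormat(obj));
    }

    protected abstract String toBackEndFormat(DataObject obj);

    protected abstract void callBackEndApi(String nativeRecord);
}

/** Single Back End neutral interface used by the sync engine. */
class CafInterface {
    private final Map<String, ContentAdapter> adapters = new HashMap<>();

    void register(String scheme, ContentAdapter adapter) {
        adapters.put(scheme, adapter);
    }

    /** Selects the adapter by URI scheme (an assumption) and delegates to it. */
    void update(DataObject obj) {
        int colon = obj.databaseUri.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a URI: " + obj.databaseUri);
        }
        ContentAdapter adapter = adapters.get(obj.databaseUri.substring(0, colon));
        if (adapter == null) {
            throw new IllegalArgumentException("No content adapter for " + obj.databaseUri);
        }
        adapter.update(obj);
    }
}

/** Example Back End dependent part that only records what it would write. */
class InMemoryAdapter extends ContentAdapter {
    final List<String> written = new ArrayList<>();
    private final String prefix;

    InMemoryAdapter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    protected String toBackEndFormat(DataObject obj) {
        return prefix + "|" + obj.payload;
    }

    @Override
    protected void callBackEndApi(String nativeRecord) {
        written.add(nativeRecord);
    }
}
```

In this sketch the sync engine only ever sees `CafInterface` and `DataObject`; a new Back End would be supported by registering one more `ContentAdapter` subclass.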

One of the main advantages of the present invention compared with the prior art is that it makes access to specific back-end databases independent of the calling application. CAF integrates components from different database providers and offers access to their database functionality through a high-level interface.

By using common interfaces, the content adapters ensure interoperability of applications with multiple Back End systems. A Sync Engine, for example, does not depend on proprietary commands of a particular database. Additionally, the components hide the complexity of the content which is exchanged between a database and an application. This significantly reduces the programming effort and the complexity of solutions. CAF also allows the backend system provider to develop just one interface for different applications accessing that backend data store. This saves both parties (the application provider and the backend supplier) a huge amount of time and money.

CAF allows:

-   sync engines to talk to different back-end systems using the same protocol and API
-   back-end system providers to create their own content adapter tailored towards their back-end system
-   a session between sync engine, CAF and the back-end system that handles the authentication, so that back-end connections can be re-used for efficiency
-   the usage of a caching system to achieve low latency for the communication between sync engine and CAF
-   load-balancing and fail-safe distribution of components

FIG. 6 shows a preferred implementation of the CAF. The CAF 100 comprises a CAF-interface 22, a Content Manager 30, and a caching mechanism.

The CAF interface 22 provides a single interface for the Sync Server to access Back End data stores 24 and therefore separates the content retrieval from the Sync Server 10.

The Content Manager 30 forwards authentication and backend management requests (e.g. get a sync anchor) to the Back End Manager 80, writes new data to the cache 50 using the Persistent Store 40, and gets updates from the cache 50 through the Persistent Store 40. Search and execute commands are performed on the Back End.

The caching mechanism provides a permanent cache 50 and a mechanism for buffering of updates into the cache 50 and synchronizing buffered updates with the respective clients. The permanent cache 50 may be a relational database like Oracle or IBM DB2 and may be accessed, for example, via JDBC calls. The caching mechanism preferably consists of a Cache Monitor 70, a Backend Monitor 60, a Back End Manager 80, and a persistent store 40. The Back End Manager 80 includes an abstract Back End Manager 80″ with its Back End specific parts 80′ (Content Adapter), the Cache Monitor 70 includes an abstract Cache Monitor 70″ with its Back End specific Cache Monitor parts 70′ (Content Adapter), and the Backend Monitor 60 includes an abstract Backend Monitor 60″ with its Back End specific Backend Monitor parts 60′ (Content Adapter).

The Cache Monitor 70 is primarily used to replicate all new data from the cache to the back end data store. Depending on the Back End requirements, different replication strategies, such as batch or trickle, may be adopted. If the primary objective is to better support sync clients instead of regular back end clients, the batch mode is preferred. The Back End dependent part 70′ of the Cache Monitor 70 is specific for each Back End data store and must exploit the features of the Back End data store (e.g. DB2, Domino). It also translates Sync Objects into a content storage specific format (e.g. Lotus Domino or MS Exchange).

The Back End Monitor 60 trickles updates that occur in the Back End data store 24 from outside the sync server (e.g. a regular Lotus Notes client updating a database) into the cache 50. This allows sync Clients to always synchronize with the latest back end data without requiring the overhead of a full replication for each sync session. The Back End specific part 60′ of the Back End Monitor 60 is specific for each Back End data store and translates the content storage specific format (e.g. Lotus Domino or MS Exchange) into CAF Sync Objects.

The Back End Monitor can have different update policies, including aggressive or lazy updates, to optimize the overall system performance.

The Back End Manager 80 provides access to administrative functionality of the Back End data store. The following functionality is preferably offered for supporting the CAF: validation of user authentication, retrieval of access permissions for authenticated users, retrieval of the current back end specific timestamp (current “sync anchor”), and adding/removing the URIs CAF wants to monitor for changes. The Back End specific part 80′ of the Back End Manager 80 is specific for each Back End data store and translates the content storage specific format (e.g. Lotus Domino or MS Exchange) into CAF Sync Objects.

The data to be synchronized can either be stored directly in the remote content store (backend) or can be cached persistently on the server (locally) for performance reasons. The Persistent Store 40 uses a persistent storage medium as a cache to optimize read/write access to the Back End data store; however, the architecture does not prevent the Persistent Store from connecting directly to the backend data store via the Cache 50 and Back End Monitor.

The communication flow within the above CAF implementation may be briefly summarized as follows:

The Content Manager 30 receives the requests from the Sync Engine 12 and forwards them either to the persistent store 40, if data needs to be retrieved or stored, or to the Back End Manager 80, if a timestamp is needed or authentication is requested. The Back End Registry, if available, contains all available Back End Managers 80 and Monitors 60, 70 and is accessed from Content Manager 30 and Persistent Store 40. The Cache Monitor 70 gets updated data from the Persistent Store 40, translates these to the Back End format, and forwards the data to the Back End data store 24 by using the Back End dependent part 70′ of the Cache Monitor. The Back End dependent part 60′ of the Back End Monitor 60 receives the updates from the Back End data store 24, translates them to Data Objects, and forwards them to the Persistent Store 40.

In case the cache is not available, or direct access to the Back End 24 is specified for a given database, the Content Manager 30 forwards the getUpdates call to the Back End Monitor 60 and the items to be updated to the Cache Monitor 70. Both monitors will use the provided user ID and password to access the backend.

Authentication Sync Engine with CAF

The sync engine has two options for authenticating its requests to CAF:

1. provide, for each command, the required backend user ID and password
2. request an authentication token from CAF for one backend by providing a user ID and a password. Each subsequent command to CAF for this sync session can be authenticated with this token (similar to an LTPA token). The following token types may be supported: read-only access, read/write access, or unrestricted read access, but write must be authenticated each time at the Back End.

Authentication CAF to Back End Data Store

The required authentication level to the Back End data store is stored in Access Control Lists for each Back End data store and checked by the Sync Adapter. Dependent on this list, CAF authenticates itself to the Back End data store either with a group user and group password valid for all users that update their data through the sync server or on a per user basis.

FIG. 7 shows the interfaces (I/F) of the CAF in the preferred implementation of FIG. 6.

The Sync Engine 12 uses the CAF interface 22 to access the Back End data stores in a generic way. To efficiently exchange data between the generic Sync Engine 12 and CAF interface 22, CAF interface 22 preferably uses the raw value binary encoding scheme for data exchange. These raw data are embedded as an ActionData object in the SyncObj together with the CAF Meta data.
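As a rough illustration, a SyncObj carrying raw ActionData together with the CAF meta data named in this description could look like the following Java sketch. The field types follow the description where it states them; the numeric values of the action constants, the use of `java.net.URI` as a stand-in for `com.ibm.caf.URI`, the `byte[]` representation of ActionData, and all accessor names are assumptions.

```java
import java.net.URI;
import java.sql.Timestamp;

/** Illustrative SyncObj: raw ActionData plus the CAF meta data fields. */
final class SyncObj {
    // Constant names come from the description; the numeric values are assumptions.
    static final short CREATE = 1;
    static final short UPDATE = 2;
    static final short DELETE = 3;

    private final Timestamp timestamp;   // current sync anchor
    private final short actionType;      // CREATE, UPDATE or DELETE
    private final String guid;           // backend specific global id
    private final String cuid;           // CAF specific cache id
    private final URI databaseURI;       // stand-in for com.ibm.caf.URI (assumption)
    private final String userID;         // user that initiated the synchronization
    private final byte[] actionData;     // raw, binary encoded payload (assumed shape)

    SyncObj(Timestamp timestamp, short actionType, String guid, String cuid,
            URI databaseURI, String userID, byte[] actionData) {
        if (actionType < CREATE || actionType > DELETE) {
            throw new IllegalArgumentException("Unknown action type: " + actionType);
        }
        this.timestamp = timestamp;
        this.actionType = actionType;
        this.guid = guid;
        this.cuid = cuid;
        this.databaseURI = databaseURI;
        this.userID = userID;
        this.actionData = actionData.clone(); // defensive copy of the raw bytes
    }

    Timestamp getTimestamp() { return timestamp; }
    short getActionType() { return actionType; }
    String getGuid() { return guid; }
    String getCuid() { return cuid; }
    URI getDatabaseURI() { return databaseURI; }
    String getUserID() { return userID; }
    byte[] getActionData() { return actionData.clone(); }
}
```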

CAF Meta data are:

-   Timestamp represents the current sync anchor of the SyncObj. It is stored as an object of the class java.sql.Timestamp.
-   ActionType represents the data action type of the SyncObj. Possible values are defined in class constants (CREATE, UPDATE, DELETE). It is stored as a short value.
-   GUID represents the backend specific global id of the SyncObj. It is stored as a java.lang.String value.
-   CUID represents the CAF specific cache id of the SyncObj. It is stored as a java.lang.String value.
-   databaseURI represents the URI of the backend database to which the SyncObj belongs. It is stored as an object of the class com.ibm.caf.URI.
-   userID represents the user id of the client that initiated the synchronization to which the SyncObj belongs. It is stored as a java.lang.String value.

The Interfaces of the CAF Specific Process Flow

The Abstract Monitor Class offers basic functionality for the integration of Back End specific Monitors into the CAF architecture. Two abstract classes called Abstract Cache Monitor and Abstract Backend Monitor are provided that can be used via inheritance for the creation of new Monitor classes (see FIG. 8).

The abstract classes provide the following functionality:

-   Queuing mechanisms for different update policies (trickle/batch updates)
-   Registration/Deregistration at the Backend Registry
-   Reading and registration for updates at the System Preferences Component
-   Handling of communication with CAF Persistent Store (Cache)

The CAF specific process flow is managed by the Abstract Backend Monitor 60 and the Abstract Cache Monitor 70 through the interfaces defined in this patent application. The direct communication with the Back End data store 24 is implemented in Back End specific components 70′, 60′ that are inherited from the Abstract Monitor classes (see FIG. 9).

Each Monitor implements an internal queue that enables different update and data propagation policies. Dependent on the usage scenario, it may be necessary to send updated data in groups to the backend, or in large time intervals.

The Abstract Monitor component implements a configurable queuingmechanism that offers the following update policies:

Amount Trigger:

The Amount Trigger monitors the size of the internal queue. It will propagate the collected items when a certain configurable threshold is reached. This update policy can be used for configuring a batch (threshold > 1 item) or trickle (threshold = 1 item) update mechanism in the specific monitor.

Interval Trigger:

The Interval Trigger monitors the time that has passed since the last propagation. It will send the collected items when a certain configurable time interval is reached.

Combined Trigger:

The Combined Trigger utilizes both above-mentioned policies: it will propagate the collected items whenever one of the triggers is activated.
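The three update policies can be illustrated with a small Java sketch of a monitor queue. The class name `UpdateQueue`, the convention that a non-positive setting disables a trigger, and passing the current time as a parameter (to keep the example deterministic) are assumptions, not part of the described framework. The sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical internal monitor queue with amount, interval and combined triggers. */
class UpdateQueue<T> {
    private final List<T> items = new ArrayList<>();
    private final int amountThreshold;   // <= 0 disables the Amount Trigger
    private final long intervalMillis;   // <= 0 disables the Interval Trigger
    private final Consumer<List<T>> propagate;
    private long lastPropagation;

    UpdateQueue(int amountThreshold, long intervalMillis, long now,
                Consumer<List<T>> propagate) {
        this.amountThreshold = amountThreshold;
        this.intervalMillis = intervalMillis;
        this.lastPropagation = now;
        this.propagate = propagate;
    }

    /** Queues an item, then evaluates the triggers. */
    void add(T item, long now) {
        items.add(item);
        check(now);
    }

    /** Evaluates the triggers; may also be called periodically by a timer. */
    void check(long now) {
        boolean amountFired = amountThreshold > 0 && items.size() >= amountThreshold;
        boolean intervalFired = intervalMillis > 0 && now - lastPropagation >= intervalMillis;
        // Combined Trigger: propagate as soon as either trigger is activated.
        if (!items.isEmpty() && (amountFired || intervalFired)) {
            propagate.accept(new ArrayList<>(items));
            items.clear();
            lastPropagation = now;
        }
    }
}
```

Setting the threshold to 1 and disabling the interval yields a trickle policy, a larger threshold yields batch updates, and enabling both settings corresponds to the Combined Trigger.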

Session Handling

The topic of session handling applies primarily to the Cache Monitor, which is part of the overall CAF processing. Session handling in the Backend Monitor is completely dependent on the backend-specific implementation.

There are two different usage patterns for the Cache Monitor: with and without the CAF Cache. When the Cache is used, no session handling is supported at all. The updates are replicated asynchronously from the cache to the backend. Since access to the backend database is performed through an admin-like user account, pooling of connections is possible for all occurring updates.

In the case that no cache is used, session handling is performed with the methods beginConnect() and endConnect(). All updates are grouped on a “per user” basis, which allows connection pooling for each synchronizing user with the given credentials.

FIG. 8 shows a synchronization flow for a two-way synchronization with session authentication preferably applied by the inventive CAF.

A two-way synchronization between Client and Server is performed where the Client has updated items A, B and F, deleted C, and created a new item D. Via an external client (e.g. Notes client), E was created on the Back End and B, C and F were updated.

Package 1 from the client sends the credentials for the Back End data store. The Sync Engine forwards these credentials to the CAF for verification. CAF asks the responsible Back End to verify the credentials and returns an authentication token valid for a synchronization session to the Sync Engine. This token needs to be included in any future request to the CAF for this synchronization session.

In package 3 the client sends its updates to the Sync server. The Sync Engine requests the Back End updates from the CAF by presenting the authentication token. Now the Sync Engine compares the lists, resolves the conflicts, and propagates the updated entries to the CAF and client. CAF stores these updates in its cache and replicates these changes later to the Back End system.
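The list comparison in this example can be sketched in Java under a simplifying assumption: every item changed on both sides is treated as a conflict candidate, including the case where one side deleted an item that the other side updated. The class `ConflictDetector` and its method are hypothetical; the action names mirror the CREATE, UPDATE and DELETE constants of the SyncObj.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper that compares the client and Back End change lists. */
class ConflictDetector {
    enum Action { CREATE, UPDATE, DELETE }

    /** Returns the IDs of items that were changed on both sides, in sorted order. */
    static Set<String> conflicts(Map<String, Action> clientChanges,
                                 Map<String, Action> backEndChanges) {
        Set<String> changedOnBothSides = new TreeSet<>(clientChanges.keySet());
        changedOnBothSides.retainAll(backEndChanges.keySet());
        return changedOnBothSides;
    }
}
```

Applied to the scenario above (client: A, B, F updated, C deleted, D created; Back End: E created, B, C, F updated), the method returns B, C and F. Items A and D would then only need to be sent to the CAF, and item E only to the client.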

The Sync Engine handles package 5, which carries the mapping table information, without interaction with CAF.

1. (canceled)
2. A system according to claim 22, wherein each of said components further comprises an abstract Back End independent part, wherein said abstract Back End independent part provides common functionalities for use by all the Back End dependent parts.
3. A system according to claim 2, wherein each of said at least one back end data store is assigned its own said component.
4. A system according to claim 22, wherein said exchange of data is synchronization of data.
5. A system according to claim 2, further comprising a cache for permanently buffering of updates of said at least one back end data store and said clients, and each said component comprises a caching mechanism for controlling and executing buffering updates into said cache and replicating buffered updates to said respective clients and said assigned back end data store.
6. A system according to claim 5, wherein said caching mechanism has a Back End Monitor.
7. A system according to claim 5, wherein said caching mechanism includes a Cache Monitor.
8. A system according to claim 6, wherein said caching mechanism further includes a Back End Manager.
9. A system according to claim 6, wherein said caching mechanism provides for each of said at least one back end data store its own Back End Monitor, Cache Monitor, and Back End Manager with its Back End dependent part and its abstract Back End independent part.
10. A system according to claim 5, wherein said caching mechanism further comprises a persistent store.
11. A system according to claim 7, wherein said Cache Monitor replicates updates from said cache to the associated one of said at least one back end data store in a batch or a continuous trickle mode.
12. A system according to claim 6, wherein said Back End Monitor replicates updates between said cache and the associated one of said at least one back end data store in a batch or a continuous trickle mode.
13. A system according to claim 5, wherein said cache and said at least one back end data store are databases.
14. A system according to claim 22, wherein said clients are mobile clients.
15. A system according to claim 4, wherein SyncML is employed as a synchronization protocol.
16. (canceled)
17. A method according to claim 23, wherein the back end specific part is inherited from an abstract back end independent part assigned to said back end data store.
18. (canceled)
19. A method according to claim 23, wherein said data objects contain meta data.
20. A method according to claim 23, wherein a synchronization protocol used exclusively between said client and said synch server is SyncML and the update received by said synch server is presented as XML documents.
21. (canceled)
22. A system for exchange of data between a plurality of clients and at least one back end data store by using a central synchronization server having a connection to said, said clients generating data to be synchronized, said system comprising: a sync engine for performing synchronization with said central synchronization server and connected to said central synchronization server; a single back end neutral interface associated with and connected to said sync engine; and a component assigned to each of said at least one back end data store, each of said components comprising a back end dependent part having an interface with said single back end neutral interface and an interface with said assigned back end data store.
23. A method for synchronization of data, said method comprising the steps of: receiving a sync session request from a client; authenticating said client against a sync server; receiving an update from said client; authenticating said client against a back end data store via a content adaptable framework interface using a back end monitor; creating data objects and filling in the update received from said client by said sync server; calling said content adaptable framework interface and forwarding said data objects; selecting an appropriate back end specific part of a component assigned to said back end data store; transforming a content adaptable framework of said data objects into a back end specific format; and executing the update by calling the back end specific part and passing the data objects to the back end specific part.